Supervised Learning of Complete Morphological Paradigms
Authors
Abstract
We describe a supervised approach to predicting the set of all inflected forms of a lexical item. Our system automatically acquires the orthographic transformation rules of morphological paradigms from labeled examples, and then learns the contexts in which those transformations apply using a discriminative sequence model. Because our approach is completely data-driven and the model is trained on examples extracted from Wiktionary, our method can extend to new languages without change. Our end-to-end system is able to predict complete paradigms with 86.1% accuracy and individual inflected forms with 94.9% accuracy, averaged across three languages and two parts of speech.
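As a rough illustration of what acquiring orthographic transformation rules from labeled examples can look like, the sketch below derives a suffix-rewrite rule from each (base form, inflected form) pair and applies it to an unseen word. This is a deliberately simplified stand-in, not the system described in the abstract: it handles only suffix changes and omits the discriminative sequence model that learns where each transformation applies. The function names and the example word pairs are hypothetical.

```python
import os
from collections import Counter


def suffix_rule(lemma, form):
    """Return (old_suffix, new_suffix) that rewrites lemma into form.

    Assumption: the two strings differ only after their shared prefix.
    """
    prefix_len = len(os.path.commonprefix([lemma, form]))
    return lemma[prefix_len:], form[prefix_len:]


def apply_rule(lemma, rule):
    """Apply a suffix rule to a lemma, or return None if it does not fit."""
    old, new = rule
    if not lemma.endswith(old):
        return None
    return lemma[:len(lemma) - len(old)] + new


# Hypothetical labeled examples: (base form, past-tense form).
examples = [("walk", "walked"), ("jump", "jumped"), ("bake", "baked")]

# Count how often each extracted rule occurs in the training pairs.
rule_counts = Counter(suffix_rule(lemma, form) for lemma, form in examples)
most_common_rule = rule_counts.most_common(1)[0][0]

# Predict the inflected form of a word not seen in training.
predicted = apply_rule("talk", most_common_rule)
```

Choosing the most frequent rule is the crudest possible selection strategy; a learned model of context would be needed to pick the "d" rule for words like "bake".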
Similar resources
Paradigm classification in supervised learning of morphology
Supervised morphological paradigm learning by identifying and aligning the longest common subsequence found in inflection tables has recently been proposed as a simple yet competitive way to induce morphological patterns. We combine this non-probabilistic strategy of inflection table generalization with a discriminative classifier to permit the reconstruction of complete inflection tables of un...
Sequence Tagging for Verb Conjugation in Romanian
Verbs in Romanian sometimes manifest local irregularities in the form of alternating letters. We present a sequence tagging based method for learning stem alternations and ending sequences. Supervised training is based on a morphological dictionary, with a few regular expression paradigms encoded by hand. Our best model improves upon previous machine learning approaches to Romanian verb conjuga...
Modeling English Past Tense Intuitions with Minimal Generalization
We describe here a supervised learning model that, given paradigms of related words, learns the morphological and phonological rules needed to derive the paradigm. The model can use its rules to make guesses about how novel forms would be inflected, and has been tested experimentally against the intuitions of
Semi-supervised learning of morphological paradigms and lexicons
We present a semi-supervised approach to the problem of paradigm induction from inflection tables. Our system extracts generalizations from inflection tables, representing the resulting paradigms in an abstract form. The process is intended to be language-independent, and to provide human-readable generalizations of paradigms. The tools we provide can be used by linguists for the rapid creation...
Assigning Inflectional Paradigms to Named Entities by Linear Successive Abstraction
This paper describes how a supervised learning method is used for assigning inflectional paradigms to organizational named entities as the main prerequisite for generating a morphological lexicon of these entities. An inflectional paradigm consists of a set of rules for generating all forms of a lexicon entry. A morphological lexicon consists of lexicon entries and their corresponding forms. Th...